Dataset statistics
| Number of variables | 24 |
|---|---|
| Number of observations | 367705 |
| Missing cells | 2191772 |
| Missing cells (%) | 24.8% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 78.2 MiB |
| Average record size in memory | 223.0 B |
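The percentage and per-record figures in this table follow from the raw counts. A minimal sketch that re-derives them, assuming the report counts one cell per variable per observation and that 1 MiB is 1024 × 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 24
n_observations = 367_705
missing_cells = 2_191_772
total_size_mib = 78.2

# One cell per (variable, observation) pair.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Convert MiB to bytes, then divide by the number of rows.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the results agree with the reported 24.8% and 223.0 B.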
Variable types
| CAT | 11 |
|---|---|
| NUM | 8 |
| DATE | 3 |
| BOOL | 2 |
Warnings
| Warning | Type |
|---|---|
| TIPOEMAIL has a high cardinality: 24651 distinct values | High cardinality |
| USU_TELF has a high cardinality: 23626 distinct values | High cardinality |
| IP_Country has a high cardinality: 82 distinct values | High cardinality |
| USU_CIIU has a high cardinality: 595 distinct values | High cardinality |
| N_sesiones is highly correlated with Ficha Básica | High correlation |
| Ficha Básica is highly correlated with N_sesiones | High correlation |
| IP_Area is highly correlated with IP_Country | High correlation |
| IP_Country is highly correlated with IP_Area | High correlation |
| CANAL_REGISTRO has 7534 (2.0%) missing values | Missing |
| IP_Country has 21764 (5.9%) missing values | Missing |
| IP_Area has 21764 (5.9%) missing values | Missing |
| USU_TIPO has 283591 (77.1%) missing values | Missing |
| USU_TAMANIO has 283589 (77.1%) missing values | Missing |
| USU_CIIU has 283589 (77.1%) missing values | Missing |
| USU_ESTADO has 283589 (77.1%) missing values | Missing |
| USU_DEPARTAMENTO has 277197 (75.4%) missing values | Missing |
| FEC_CLIENTE has 365092 (99.3%) missing values | Missing |
| FEC_ALTA has 363994 (99.0%) missing values | Missing |
| Ficha Básica is highly skewed (γ1 = 223.0141021) | Skewed |
| N_logins is highly skewed (γ1 = 77.6662565) | Skewed |
| N_sesiones is highly skewed (γ1 = 218.6421765) | Skewed |
| IDUSUARIO has unique values | Unique |
| BONDAD_EMAIL has 54067 (14.7%) zeros | Zeros |
| IPCASOS has 15822 (4.3%) zeros | Zeros |
| Ficha Básica has 250633 (68.2%) zeros | Zeros |
| Perfil Promocional has 24290 (6.6%) zeros | Zeros |
| N_logins has 170625 (46.4%) zeros | Zeros |
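The Skewed entries report the skewness coefficient γ1. As a rough illustration of why a column made mostly of zeros plus a few very large values scores so high, the sketch below computes the population skewness m3 / m2^1.5 on a made-up sample. The function name and the data are hypothetical, and the report may use a bias-corrected estimator, so its figures need not match this formula exactly.

```python
def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical toy sample: nine zeros and one large outlier.
heavy_tail = [0] * 9 + [100]
# A symmetric sample for comparison.
symmetric = [1, 2, 3, 4, 5]

print(skewness(heavy_tail))
print(skewness(symmetric))
```

A single outlier is enough to push the coefficient well above zero, while the symmetric sample has no skew.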
Reproduction
| Analysis started | 2022-05-15 02:47:43.563358 |
|---|---|
| Analysis finished | 2022-05-15 02:48:59.998141 |
| Duration | 1 minute and 16.43 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
IDUSUARIO
Categorical
| Distinct | 367705 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.8 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 7802524 | 1 | < 0.1% | |
| 7398341 | 1 | < 0.1% | |
| 7566377 | 1 | < 0.1% | |
| 7604706 | 1 | < 0.1% | |
| 7283966 | 1 | < 0.1% | |
| 7885379 | 1 | < 0.1% | |
| 7455915 | 1 | < 0.1% | |
| 7941715 | 1 | < 0.1% | |
| 7703927 | 1 | < 0.1% | |
| 7163218 | 1 | < 0.1% | |
| Other values (367695) | 367695 | > 99.9% |
Frequencies of value counts
Unique
| Unique | 367705 |
|---|---|
| Unique (%) | 100.0% |
Histogram of lengths of the category
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 7 |
| Min length | 7 |
TIPOUSUARIO
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.8 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| PF | 265760 | 72.3% | |
| PJ | 89824 | 24.4% | |
| PX | 12121 | 3.3% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
FEC_REGISTRO
Date
| Distinct | 730 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.8 MiB |
| Minimum | 2018-01-01 00:00:00 |
|---|---|
| Maximum | 2019-12-31 00:00:00 |
Histogram with fixed size bins (bins=50)
CANAL_REGISTRO
Real number (ℝ≥0)
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 7534 |
| Missing (%) | 2.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.896654645 |
|---|---|
| Minimum | 1 |
| Maximum | 9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.8 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 3 |
| Q3 | 7 |
| 95-th percentile | 8 |
| Maximum | 9 |
| Range | 8 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 2.412749497 |
|---|---|
| Coefficient of variation (CV) | 0.6191848436 |
| Kurtosis | -0.9092001722 |
| Mean | 3.896654645 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.7642777622 |
| Sum | 1403462 |
| Variance | 5.821360133 |
| Monotonicity | Not monotonic |
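Several rows in the quantile and descriptive tables are derived from others. A small sketch, using the figures reported for this variable, shows the relationships (the variable names are illustrative; the non-missing count assumes 367705 observations minus the 7534 missing values):

```python
# Values copied from the tables for this variable.
mean = 3.896654645
std = 2.412749497
total = 1403462
q1, q3 = 2, 7
minimum, maximum = 1, 9
n_non_missing = 367_705 - 7_534

cv = std / mean                        # coefficient of variation
variance = std ** 2                    # variance is the squared standard deviation
iqr = q3 - q1                          # interquartile range
value_range = maximum - minimum
mean_check = total / n_non_missing     # the mean ignores missing values

print(cv, variance, iqr, value_range, mean_check)
```

Each result agrees with the corresponding table row to the precision shown in the report.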
Histogram with fixed size bins (bins=8)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 118939 | 32.3% | |
| 2 | 84629 | 23.0% | |
| 8 | 47710 | 13.0% | |
| 7 | 37157 | 10.1% | |
| 1 | 36461 | 9.9% | |
| 4 | 16357 | 4.4% | |
| 6 | 12181 | 3.3% | |
| 9 | 6737 | 1.8% | |
| (Missing) | 7534 | 2.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 36461 | 9.9% | |
| 2 | 84629 | 23.0% | |
| 3 | 118939 | 32.3% | |
| 4 | 16357 | 4.4% | |
| 6 | 12181 | 3.3% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 9 | 6737 | 1.8% | |
| 8 | 47710 | 13.0% | |
| 7 | 37157 | 10.1% | |
| 6 | 12181 | 3.3% | |
| 4 | 16357 | 4.4% |
IND_CLIENTE
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.8 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 365086 | 99.3% | |
| 1 | 2619 | 0.7% |
IND_ALTA
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.8 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 363995 | 99.0% | |
| 1 | 3710 | 1.0% |
TIPOEMAIL
Categorical
| Distinct | 24651 |
|---|---|
| Distinct (%) | 6.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.8 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| gmail.com | 159897 | 43.5% | |
| hotmail.com | 121607 | 33.1% | |
| yahoo.com | 6503 | 1.8% | |
| yahoo.es | 5239 | 1.4% | |
| outlook.com | 4884 | 1.3% | |
| hotmail.es | 3638 | 1.0% | |
| yopmail.com | 3366 | 0.9% | |
| misena.edu.co | 2792 | 0.8% | |
| outlook.es | 1764 | 0.5% | |
| live.com | 1062 | 0.3% | |
| Other values (24641) | 56953 | 15.5% |
Frequencies of value counts
Unique
| Unique | 19738 |
|---|---|
| Unique (%) | 5.4% |
Histogram of lengths of the category
Length
| Max length | 41 |
|---|---|
| Median length | 10 |
| Mean length | 10.40228172 |
| Min length | 4 |
BONDAD_EMAIL
Real number (ℝ)
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.97477869 |
|---|---|
| Minimum | -20 |
| Maximum | 20 |
| Zeros | 54067 |
| Zeros (%) | 14.7% |
| Memory size | 2.8 MiB |
Quantile statistics
| Minimum | -20 |
|---|---|
| 5-th percentile | -10 |
| Q1 | 9 |
| median | 20 |
| Q3 | 20 |
| 95-th percentile | 20 |
| Maximum | 20 |
| Range | 40 |
| Interquartile range (IQR) | 11 |
Descriptive statistics
| Standard deviation | 11.07176257 |
|---|---|
| Coefficient of variation (CV) | 0.7922674709 |
| Kurtosis | 1.376678552 |
| Mean | 13.97477869 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -1.609586954 |
| Sum | 5138596 |
| Variance | 122.5839265 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=6)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 20 | 274960 | 74.8% | |
| 0 | 54067 | 14.7% | |
| -10 | 16825 | 4.6% | |
| -20 | 12051 | 3.3% | |
| 1 | 4944 | 1.3% | |
| 9 | 4858 | 1.3% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| -20 | 12051 | 3.3% | |
| -10 | 16825 | 4.6% | |
| 0 | 54067 | 14.7% | |
| 1 | 4944 | 1.3% | |
| 9 | 4858 | 1.3% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 20 | 274960 | 74.8% | |
| 9 | 4858 | 1.3% | |
| 1 | 4944 | 1.3% | |
| 0 | 54067 | 14.7% | |
| -10 | 16825 | 4.6% |
USU_TELF
Categorical
| Distinct | 23626 |
|---|---|
| Distinct (%) | 6.4% |
| Missing | 66 |
| Missing (%) | < 0.1% |
| Memory size | 2.8 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 174XXXXX | 926 | 0.3% | |
| 44XXXXX | 747 | 0.2% | |
| 145XXXXX | 711 | 0.2% | |
| 32XXXXX | 711 | 0.2% | |
| 1174XXXXX | 685 | 0.2% | |
| 444XXXXX | 673 | 0.2% | |
| 0544XXXXX | 667 | 0.2% | |
| 33XXXXX | 633 | 0.2% | |
| 74XXXXX | 628 | 0.2% | |
| 175XXXXX | 619 | 0.2% | |
| Other values (23616) | 360639 | 98.1% |
Frequencies of value counts
Unique
| Unique | 9628 |
|---|---|
| Unique (%) | 2.6% |
Histogram of lengths of the category
Length
| Max length | 20 |
|---|---|
| Median length | 10 |
| Mean length | 9.636969854 |
| Min length | 3 |
IPCASOS
Real number (ℝ≥0)
| Distinct | 277 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 351.3182986 |
|---|---|
| Minimum | 0 |
| Maximum | 16393 |
| Zeros | 15822 |
| Zeros (%) | 4.3% |
| Memory size | 2.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 2 |
| Q3 | 6 |
| 95-th percentile | 1708 |
| Maximum | 16393 |
| Range | 16393 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 1692.108796 |
|---|---|
| Coefficient of variation (CV) | 4.816455057 |
| Kurtosis | 46.4272444 |
| Mean | 351.3182986 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 6.491427918 |
| Sum | 129181495 |
| Variance | 2863232.178 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 164363 | 44.7% | |
| 2 | 46196 | 12.6% | |
| 3 | 23891 | 6.5% | |
| 0 | 15822 | 4.3% | |
| 4 | 14364 | 3.9% | |
| 5 | 9855 | 2.7% | |
| 6 | 6935 | 1.9% | |
| 7 | 5517 | 1.5% | |
| 8 | 4387 | 1.2% | |
| 9 | 3420 | 0.9% | |
| Other values (267) | 72955 | 19.8% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 15822 | 4.3% | |
| 1 | 164363 | 44.7% | |
| 2 | 46196 | 12.6% | |
| 3 | 23891 | 6.5% | |
| 4 | 14364 | 3.9% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 16393 | 988 | 0.3% | |
| 13023 | 1378 | 0.4% | |
| 12998 | 1037 | 0.3% | |
| 10762 | 1346 | 0.4% | |
| 7114 | 1232 | 0.3% |
IP_Country
Categorical
| Distinct | 82 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 21764 |
| Missing (%) | 5.9% |
| Memory size | 2.8 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| Colombia | 340368 | 92.6% | |
| United States | 2703 | 0.7% | |
| Peru | 436 | 0.1% | |
| Venezuela | 297 | 0.1% | |
| Spain | 275 | 0.1% | |
| Brazil | 218 | 0.1% | |
| Argentina | 211 | 0.1% | |
| Ecuador | 184 | 0.1% | |
| Chile | 165 | < 0.1% | |
| Mexico | 157 | < 0.1% | |
| Other values (72) | 927 | 0.3% | |
| (Missing) | 21764 | 5.9% |
Frequencies of value counts
Unique
| Unique | 23 |
|---|---|
| Unique (%) | < 0.1% |
Histogram of lengths of the category
Length
| Max length | 33 |
|---|---|
| Median length | 8 |
| Mean length | 7.730797786 |
| Min length | 3 |
IP_Area
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 21764 |
| Missing (%) | 5.9% |
| Memory size | 2.8 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| America | 345170 | 93.9% | |
| Europa | 621 | 0.2% | |
| Asia | 113 | < 0.1% | |
| Oceania | 34 | < 0.1% | |
| Africa | 3 | < 0.1% | |
| (Missing) | 21764 | 5.9% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 6.760626045 |
| Min length | 3 |
USU_TIPO
Categorical
| Distinct | 9 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 283591 |
| Missing (%) | 77.1% |
| Memory size | 2.8 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| EMPRESARIO INDIVIDUAL | 39912 | 10.9% | |
| SOCIEDAD COMERCIAL/INDUSTRIAL | 37840 | 10.3% | |
| ENTIDAD FINANCIERA O DE SEGUROS | 2509 | 0.7% | |
| ENTIDAD SIN ANIMO DE LUCRO | 2479 | 0.7% | |
| ORGANISMO ESTATAL | 691 | 0.2% | |
| HOLDING | 393 | 0.1% | |
| ENTIDAD EXTRANJERA | 275 | 0.1% | |
| SOCIEDAD NO COMERCIAL | 13 | < 0.1% | |
| INDUSTRIA / COMERCIO | 2 | < 0.1% | |
| (Missing) | 283591 | 77.1% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 31 |
|---|---|
| Median length | 3 |
| Mean length | 8.018055234 |
| Min length | 3 |
USU_TAMANIO
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 283589 |
| Missing (%) | 77.1% |
| Memory size | 2.8 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| MC | 60626 | 16.5% | |
| PQ | 10113 | 2.8% | |
| GR | 5822 | 1.6% | |
| MD | 5739 | 1.6% | |
| SD | 1816 | 0.5% | |
| (Missing) | 283589 | 77.1% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 2.771240532 |
| Min length | 2 |
USU_CIIU
Categorical
| Distinct | 595 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 283589 |
| Missing (%) | 77.1% |
| Memory size | 2.8 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| G4711 | 2590 | 0.7% | |
| I5611 | 2004 | 0.5% | |
| M7110 | 1940 | 0.5% | |
| M7020 | 1717 | 0.5% | |
| G4771 | 1609 | 0.4% | |
| G4773 | 1553 | 0.4% | |
| N8299 | 1530 | 0.4% | |
| I5630 | 1527 | 0.4% | |
| F4290 | 1464 | 0.4% | |
| S9499 | 1441 | 0.4% | |
| Other values (585) | 66741 | 18.2% | |
| (Missing) | 283589 | 77.1% |
Frequencies of value counts
Unique
| Unique | 69 |
|---|---|
| Unique (%) | 0.1% |
Histogram of lengths of the category
Length
| Max length | 5 |
|---|---|
| Median length | 3 |
| Mean length | 3.456112917 |
| Min length | 3 |
USU_ESTADO
Categorical
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 283589 |
| Missing (%) | 77.1% |
| Memory size | 2.8 MiB |
| ACTIVA | |
|---|---|
| CANCELACIÓN | |
| LIQUIDACION | 2362 |
| LEY DE INSOLVENCIA (REORGANIZACION EMPRESARIAL) | 500 |
| EXTINGUIDA | 275 |
| Other values (6) | 323 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| ACTIVA | 67580 | 18.4% | |
| CANCELACIÓN | 13076 | 3.6% | |
| LIQUIDACION | 2362 | 0.6% | |
| LEY DE INSOLVENCIA (REORGANIZACION EMPRESARIAL) | 500 | 0.1% | |
| EXTINGUIDA | 275 | 0.1% | |
| INACTIVA TEMPORAL | 261 | 0.1% | |
| REESTRUCTURACION O CONCORDATO | 32 | < 0.1% | |
| INTERVENIDA | 9 | < 0.1% | |
| COINCIDENCIA HOMOGRAFA LISTA CLINTON (SDNT OFAC) | 9 | < 0.1% | |
| SALIDA CLINTON (SDNT OFAC) | 7 | < 0.1% | |
| (Missing) | 283589 | 77.1% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 48 |
|---|---|
| Median length | 3 |
| Mean length | 3.966489441 |
| Min length | 3 |
USU_DEPARTAMENTO
Categorical
| Distinct | 33 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 277197 |
| Missing (%) | 75.4% |
| Memory size | 2.8 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| BOGOTA | 33307 | 9.1% | |
| ANTIOQUIA | 11168 | 3.0% | |
| VALLE | 7312 | 2.0% | |
| CUNDINAMARCA | 4933 | 1.3% | |
| ATLANTICO | 4284 | 1.2% | |
| SANTANDER | 3888 | 1.1% | |
| BOYACA | 2170 | 0.6% | |
| BOLIVAR | 2100 | 0.6% | |
| RISARALDA | 2085 | 0.6% | |
| NORTE SANTANDER | 2004 | 0.5% | |
| Other values (23) | 17257 | 4.7% | |
| (Missing) | 277197 | 75.4% |
Frequencies of value counts
Unique
| Unique | 2 |
|---|---|
| Unique (%) | < 0.1% |
Histogram of lengths of the category
Length
| Max length | 15 |
|---|---|
| Median length | 3 |
| Mean length | 4.036072395 |
| Min length | 3 |
FEC_CLIENTE
Date
| Distinct | 810 |
|---|---|
| Distinct (%) | 31.0% |
| Missing | 365092 |
| Missing (%) | 99.3% |
| Memory size | 2.8 MiB |
| Minimum | 2018-01-02 00:00:00 |
|---|---|
| Maximum | 2021-04-01 00:00:00 |
Histogram with fixed size bins (bins=50)
FEC_ALTA
Date
| Distinct | 902 |
|---|---|
| Distinct (%) | 24.3% |
| Missing | 363994 |
| Missing (%) | 99.0% |
| Memory size | 2.8 MiB |
| Minimum | 2018-01-02 00:00:00 |
|---|---|
| Maximum | 2021-11-01 00:00:00 |
Histogram with fixed size bins (bins=50)
Ficha Básica
Real number (ℝ≥0)
| Distinct | 154 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.011647924 |
|---|---|
| Minimum | 0 |
| Maximum | 3206 |
| Zeros | 250633 |
| Zeros (%) | 68.2% |
| Memory size | 2.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 5 |
| Maximum | 3206 |
| Range | 3206 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 9.103286641 |
|---|---|
| Coefficient of variation (CV) | 8.998473107 |
| Kurtosis | 64629.21345 |
| Mean | 1.011647924 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 223.0141021 |
| Sum | 371988 |
| Variance | 82.86982767 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 250633 | 68.2% | |
| 1 | 54529 | 14.8% | |
| 2 | 22148 | 6.0% | |
| 3 | 11217 | 3.1% | |
| 4 | 7327 | 2.0% | |
| 5 | 6525 | 1.8% | |
| 6 | 5205 | 1.4% | |
| 7 | 2688 | 0.7% | |
| 8 | 1733 | 0.5% | |
| 9 | 1094 | 0.3% | |
| Other values (144) | 4606 | 1.3% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 250633 | 68.2% | |
| 1 | 54529 | 14.8% | |
| 2 | 22148 | 6.0% | |
| 3 | 11217 | 3.1% | |
| 4 | 7327 | 2.0% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 3206 | 1 | < 0.1% | |
| 2342 | 1 | < 0.1% | |
| 2062 | 1 | < 0.1% | |
| 1659 | 1 | < 0.1% | |
| 1085 | 1 | < 0.1% |
Perfil Promocional
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.153185298 |
|---|---|
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 24290 |
| Zeros (%) | 6.6% |
| Memory size | 2.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 3 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.8747251344 |
|---|---|
| Coefficient of variation (CV) | 0.758529558 |
| Kurtosis | 11.63350115 |
| Mean | 1.153185298 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.293654164 |
| Sum | 424032 |
| Variance | 0.7651440608 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=6)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 313236 | 85.2% | |
| 0 | 24290 | 6.6% | |
| 5 | 12846 | 3.5% | |
| 2 | 8991 | 2.4% | |
| 3 | 4784 | 1.3% | |
| 4 | 3558 | 1.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 24290 | 6.6% | |
| 1 | 313236 | 85.2% | |
| 2 | 8991 | 2.4% | |
| 3 | 4784 | 1.3% | |
| 4 | 3558 | 1.0% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 12846 | 3.5% | |
| 4 | 3558 | 1.0% | |
| 3 | 4784 | 1.3% | |
| 2 | 8991 | 2.4% | |
| 1 | 313236 | 85.2% |
N_logins
Real number (ℝ≥0)
| Distinct | 161 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.676974749 |
|---|---|
| Minimum | 0 |
| Maximum | 1307 |
| Zeros | 170625 |
| Zeros (%) | 46.4% |
| Memory size | 2.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 7 |
| Maximum | 1307 |
| Range | 1307 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 4.868763107 |
|---|---|
| Coefficient of variation (CV) | 2.903301383 |
| Kurtosis | 15831.76257 |
| Mean | 1.676974749 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 77.6662565 |
| Sum | 616632 |
| Variance | 23.70485419 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 170625 | 46.4% | |
| 1 | 81097 | 22.1% | |
| 2 | 41029 | 11.2% | |
| 3 | 23724 | 6.5% | |
| 4 | 15059 | 4.1% | |
| 5 | 10228 | 2.8% | |
| 6 | 6875 | 1.9% | |
| 7 | 4784 | 1.3% | |
| 8 | 3435 | 0.9% | |
| 9 | 2440 | 0.7% | |
| Other values (151) | 8409 | 2.3% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 170625 | 46.4% | |
| 1 | 81097 | 22.1% | |
| 2 | 41029 | 11.2% | |
| 3 | 23724 | 6.5% | |
| 4 | 15059 | 4.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1307 | 1 | < 0.1% | |
| 559 | 1 | < 0.1% | |
| 481 | 1 | < 0.1% | |
| 478 | 1 | < 0.1% | |
| 437 | 1 | < 0.1% |
N_sesiones
Real number (ℝ≥0)
| Distinct | 225 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 3 |
| Missing (%) | < 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.314771201 |
|---|---|
| Minimum | 1 |
| Maximum | 6346 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.8 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 3 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 16 |
| Maximum | 6346 |
| Range | 6345 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 17.345244 |
|---|---|
| Coefficient of variation (CV) | 3.263591855 |
| Kurtosis | 65626.75611 |
| Mean | 5.314771201 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 218.6421765 |
| Sum | 1954252 |
| Variance | 300.8574895 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 222788 | 60.6% | |
| 5 | 38327 | 10.4% | |
| 4 | 34009 | 9.2% | |
| 6 | 14120 | 3.8% | |
| 7 | 11584 | 3.2% | |
| 8 | 5869 | 1.6% | |
| 9 | 5474 | 1.5% | |
| 10 | 3692 | 1.0% | |
| 11 | 2658 | 0.7% | |
| 13 | 2335 | 0.6% | |
| Other values (215) | 26846 | 7.3% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 4 | < 0.1% | |
| 2 | 2196 | 0.6% | |
| 3 | 222788 | 60.6% | |
| 4 | 34009 | 9.2% | |
| 5 | 38327 | 10.4% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 6346 | 1 | < 0.1% | |
| 4017 | 1 | < 0.1% | |
| 3696 | 1 | < 0.1% | |
| 3154 | 1 | < 0.1% | |
| 2033 | 1 | < 0.1% |
díasHastaCliente
Real number (ℝ)
| Distinct | 1312 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1218.82716 |
|---|---|
| Minimum | -322 |
| Maximum | 1595 |
| Zeros | 1395 |
| Zeros (%) | 0.4% |
| Memory size | 2.8 MiB |
Quantile statistics
| Minimum | -322 |
|---|---|
| 5-th percentile | 900 |
| Q1 | 1028 |
| median | 1217 |
| Q3 | 1423 |
| 95-th percentile | 1551 |
| Maximum | 1595 |
| Range | 1917 |
| Interquartile range (IQR) | 395 |
Descriptive statistics
| Standard deviation | 235.0874545 |
|---|---|
| Coefficient of variation (CV) | 0.1928800589 |
| Kurtosis | 2.544942124 |
| Mean | 1218.82716 |
| Median Absolute Deviation (MAD) | 197 |
| Skewness | -0.7623160509 |
| Sum | 448168841 |
| Variance | 55266.11126 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1395 | 0.4% | |
| 1523 | 976 | 0.3% | |
| 1516 | 967 | 0.3% | |
| 1517 | 964 | 0.3% | |
| 1539 | 948 | 0.3% | |
| 1524 | 940 | 0.3% | |
| 1453 | 937 | 0.3% | |
| 1448 | 932 | 0.3% | |
| 1412 | 931 | 0.3% | |
| 1490 | 928 | 0.3% | |
| Other values (1302) | 357787 | 97.3% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| -322 | 1 | < 0.1% | |
| -320 | 1 | < 0.1% | |
| -313 | 1 | < 0.1% | |
| -302 | 1 | < 0.1% | |
| -291 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1595 | 45 | < 0.1% | |
| 1594 | 473 | 0.1% | |
| 1593 | 804 | 0.2% | |
| 1592 | 280 | 0.1% | |
| 1591 | 315 | 0.1% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so the steepness of a linear relationship does not affect r; only its direction does. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
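As a minimal sketch of that calculation (plain Python, with a hypothetical function name and toy data; the report itself does not show how it computes r):

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of products and squares are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r_original = pearson_r(xs, ys)
# Shifting and rescaling each variable separately leaves r unchanged.
r_transformed = pearson_r([10 * x + 3 for x in xs], [0.5 * y - 7 for y in ys])
print(r_original, r_transformed)
```

The second call illustrates the invariance under changes in location and scale.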
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
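A minimal sketch of the rank-based calculation, assuming tied values receive the average of their ranks (a common convention that the report does not spell out; the function names are hypothetical):

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation applied to the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    ss_x = sum((a - mx) ** 2 for a in rx)
    ss_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(ss_x * ss_y)


# A cubic relationship is nonlinear but perfectly monotonic.
print(spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]))
```

Because only the ordering matters, the cubic example reaches the maximum value even though the relationship is not linear.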
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
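The pair-counting definition can be sketched directly. This is the simple form described above (often called tau-a); statistical libraries commonly apply a tie correction, so their results can differ when ties are present. The function name and data are hypothetical.

```python
def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, without tie correction."""
    n = len(xs)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            # A pair is concordant when both variables move in the same direction.
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    n_pairs = n * (n - 1) // 2
    return (concordant - discordant) / n_pairs


# Six pairs in total: five concordant, one discordant (the swapped middle values).
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

The double loop is quadratic in the number of observations, which is fine for illustration but slow for a dataset of this size.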
Phik (φk)
Phik (φk) is a correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient when the input follows a bivariate normal distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | IDUSUARIO | TIPOUSUARIO | FEC_REGISTRO | CANAL_REGISTRO | IND_CLIENTE | IND_ALTA | TIPOEMAIL | BONDAD_EMAIL | USU_TELF | IPCASOS | IP_Country | IP_Area | USU_TIPO | USU_TAMANIO | USU_CIIU | USU_ESTADO | USU_DEPARTAMENTO | FEC_CLIENTE | FEC_ALTA | Ficha Básica | Perfil Promocional | N_logins | N_sesiones | díasHastaCliente |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8107310 | PF | 2019-10-22 | 3.0 | 0 | 0 | yahoo.com | 0 | 233XXXXX | 1 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 0.0 | 1.0 | 1.0 | 5.0 | 936 |
| 1 | 7784565 | PJ | 2019-05-14 | 3.0 | 0 | 0 | gmail.com | 20 | 633XXXXX | 1 | Colombia | America | SOCIEDAD COMERCIAL/INDUSTRIAL | PQ | N7820 | ACTIVA | QUINDIO | NaT | NaT | 0.0 | 1.0 | 3.0 | 3.0 | 1097 |
| 2 | 7718778 | PJ | 2019-09-04 | 7.0 | 0 | 0 | hotmail.com | 20 | 533XXXXX | 1 | Colombia | America | SOCIEDAD COMERCIAL/INDUSTRIAL | MC | G4774 | ACTIVA | ATLANTICO | NaT | NaT | 0.0 | 1.0 | 0.0 | 3.0 | 984 |
| 3 | 7952765 | PX | 2019-12-08 | 3.0 | 0 | 0 | uqvirtual.edu.co | 20 | 633XXXXX | 1 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 0.0 | 1.0 | 1.0 | 3.0 | 889 |
| 4 | 7855424 | PJ | 2019-06-21 | 7.0 | 0 | 0 | hotmail.com | 20 | 533XXXXX | 1 | Colombia | America | EMPRESARIO INDIVIDUAL | MC | N8299 | CANCELACIÓN | ATLANTICO | NaT | NaT | 0.0 | 1.0 | 0.0 | 3.0 | 1059 |
| 5 | 8031418 | PX | 2019-09-18 | 3.0 | 0 | 0 | gmail.com | 20 | 633XXXXX | 2 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 1.0 | 1.0 | 3.0 | 5.0 | 970 |
| 6 | 8189769 | PJ | 2019-11-27 | 3.0 | 0 | 0 | yahoo.com | 20 | 233XXXXX | 1 | Colombia | America | HOLDING | MC | M7010 | ACTIVA | VALLE | NaT | NaT | 0.0 | 1.0 | 1.0 | 5.0 | 900 |
| 7 | 7658143 | PJ | 2019-12-03 | 2.0 | 0 | 0 | marval.com.co | 20 | 63XXXXX | 4 | Colombia | America | SOCIEDAD COMERCIAL/INDUSTRIAL | GR | F4111 | ACTIVA | SANTANDER | NaT | NaT | 0.0 | 1.0 | 0.0 | 3.0 | 894 |
| 8 | 7970569 | PJ | 2019-08-21 | 7.0 | 0 | 0 | gmail.com | 20 | 533XXXXX | 1 | Colombia | America | SOCIEDAD COMERCIAL/INDUSTRIAL | MC | G4752 | ACTIVA | ATLANTICO | NaT | NaT | 0.0 | 1.0 | 0.0 | 3.0 | 998 |
| 9 | 7594738 | PJ | 2019-02-14 | 7.0 | 0 | 0 | amparoldeb.com | 20 | 633XXXXX | 1 | Colombia | America | ENTIDAD FINANCIERA O DE SEGUROS | MC | K6621 | ACTIVA | RISARALDA | NaT | NaT | 0.0 | 1.0 | 2.0 | 3.0 | 1186 |
Last rows
| | IDUSUARIO | TIPOUSUARIO | FEC_REGISTRO | CANAL_REGISTRO | IND_CLIENTE | IND_ALTA | TIPOEMAIL | BONDAD_EMAIL | USU_TELF | IPCASOS | IP_Country | IP_Area | USU_TIPO | USU_TAMANIO | USU_CIIU | USU_ESTADO | USU_DEPARTAMENTO | FEC_CLIENTE | FEC_ALTA | Ficha Básica | Perfil Promocional | N_logins | N_sesiones | díasHastaCliente |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 367695 | 8188180 | PF | 2019-11-27 | 1.0 | 0 | 0 | hotmail.com | 0 | NaN | 3225 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 1.0 | 1.0 | 1.0 | 4.0 | 900 |
| 367696 | 8103216 | PF | 2019-10-21 | 1.0 | 0 | 0 | gmail.com | 20 | NaN | 4 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 2.0 | 2.0 | 5.0 | 11.0 | 937 |
| 367697 | 8205258 | PF | 2019-05-12 | 1.0 | 0 | 0 | gmail.com | 0 | NaN | 373 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 13.0 | 5.0 | 3.0 | 34.0 | 1099 |
| 367698 | 8108800 | PF | 2019-10-23 | 1.0 | 0 | 0 | hotmail.com | 20 | NaN | 1806 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 2.0 | 1.0 | 3.0 | 9.0 | 935 |
| 367699 | 8136973 | PF | 2019-05-11 | 1.0 | 0 | 0 | hotmail.com | 0 | NaN | 1806 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 11.0 | 5.0 | 4.0 | 25.0 | 1100 |
| 367700 | 8141168 | PF | 2019-06-11 | 1.0 | 0 | 0 | hotmail.com | 0 | NaN | 1806 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 3.0 | 2.0 | 1.0 | 9.0 | 1069 |
| 367701 | 8147354 | PF | 2019-08-11 | 1.0 | 0 | 0 | hotmail.com | 0 | NaN | 1806 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 5.0 | 4.0 | 5.0 | 14.0 | 1008 |
| 367702 | 8153565 | PF | 2019-12-11 | 1.0 | 0 | 0 | hotmail.com | 0 | NaN | 1806 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 5.0 | 3.0 | 3.0 | 14.0 | 886 |
| 367703 | 8169002 | PF | 2019-11-18 | 1.0 | 0 | 0 | gesticobranzas.com | 9 | NaN | 1806 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 3.0 | 1.0 | 1.0 | 8.0 | 909 |
| 367704 | 8205187 | PF | 2019-05-12 | 1.0 | 0 | 0 | hotmail.com | 0 | NaN | 1806 | Colombia | America | NaN | NaN | NaN | NaN | NaN | NaT | NaT | 5.0 | 4.0 | 3.0 | 15.0 | 1099 |